Goto

Collaborating Authors

 lattice layer




Hierarchical Lattice Layer for Partially Monotone Neural Networks

Neural Information Processing Systems

Partially monotone regression is a regression analysis in which the target values are monotonically increasing with respect to a subset of input features. The TensorFlow Lattice library is one of the standard machine learning libraries for partially monotone regression. It consists of several neural network layers, and its core component is the lattice layer. One of the problems of the lattice layer is that it requires the projected gradient descent algorithm with many constraints to train it. Another problem is that it cannot receive a high-dimensional input vector due to the memory consumption. We propose a novel neural network layer, the hierarchical lattice layer (HLL), as an extension of the lattice layer so that we can use a standard stochastic gradient descent algorithm to train HLL while satisfying monotonicity constraints and so that it can receive a high-dimensional input vector.


Smooth Monotonic Networks

Igel, Christian

arXiv.org Artificial Intelligence

Monotonicity constraints are powerful regularizers in statistical modelling. They can support fairness in computer supported decision making and increase plausibility in data-driven scientific models. The seminal min-max (MM) neural network architecture ensures monotonicity, but often gets stuck in undesired local optima during training because of vanishing gradients. We propose a simple modification of the MM network using strictly-increasing smooth non-linearities that alleviates this problem. The resulting smooth min-max (SMM) network module inherits the asymptotic approximation properties from the MM architecture. It can be used within larger deep learning systems trained end-to-end. The SMM module is considerably simpler and less computationally demanding than state-of-the-art neural networks for monotonic modelling. Still, in our experiments, it compared favorably to alternative neural and non-neural approaches in terms of generalization performance.


Adding Common Sense to Machine Learning with TensorFlow Lattice

#artificialintelligence

Training-serving skew: The offline numbers may look great, but what if your model will be evaluated on a different or broader set of examples than those found in the training set? This phenomenon, more generally referred to as "dataset shift" or "distribution shift", happens all the time in real-world situations. Models are trained on a curated set of examples, or clicks on top-ranked recommendations, or a specific geographical region, and then applied to every user or use case. Curiosities and anomalies in your training and testing data become genuine and sustained loss patterns. Bad individual errors: Models are often judged by their worst behavior --- a single egregious outcome can damage the faith that important stakeholders have in the model and even cause serious reputational harm to your business or institution.